An apparatus capable of reproducing the voice of
reading aloud a book for a sufficient period of time.
At the time of reproduction, the speed of reading is
variable, and the output voice is normal as in the
normal speed reproduction. The apparatus comprises
means for digitally storing voice signals in a mode
that the voiceless parts are coded substantially;
means for generating voices by reproducing the digital
voice signals at a desired speed of speech; and means
for adjusting the desired speed of the speech.
TECHNICAL FIELD

The present invention relates to a voice
recording/reproducing device and, more specifically,
to a voice reproducing device such as a device for
learning and education or a voice-producing electronic
book and the like.

BACKGROUND ART 

There has been proposed a device for reproducing
voice by using a digital large-capacity recording
medium such as a CD-ROM or the like; the reproducing
time, however, lasts for about 70 minutes at the
longest. Reproduction of such a length may be enough
for recording music but is not enough for recording
the whole recitation of such a book as a paperback, a
study book or the like. In particular, in the case of
a learning device which the user plays repetitively
for understanding and recognition, and which must
reproduce a clear voice while changing the speed, such
as lowering the speed for extended periods of time,
the above-mentioned digital recording medium is far
from being of practical use unless the contents of the
study material are adjusted or omitted. Further
increased difficulty is encountered when other storage
media are used.

In addition, it was often required to intentionally 
adjust the speed of recitation during recording. 

The object of the present invention is to pro- 
vide a voice recording/reproducing device which is 
free from the above-mentioned problems inherent 
in the prior art, enables the voice data to be stored 
in sufficient amounts in a predetermined recording 
medium and further enables the voice output which 
is close to natural recitation to be obtained for an 
extended period of time at the time of the re- 
production. 

DISCLOSURE OF THE INVENTION 

In order to accomplish the above-mentioned 
object, the voice recording/reproducing device ac- 
cording to the present invention basically has the 
following technical constitutions. 

That is, a voice reproducing device comprising 
a voice signal recording means for recording a 
voice signal of which a voiceless part is converted 
into a predetermined voiceless part indication data 
signal, and a voice reproducing means which re- 
produces the recorded voice signal at a desired 
speed of speech. Concretely, a voice record- 
ing/reproducing device comprising a voiceless part 
indication conversion means which converts a 
voiceless part included in an input voice signal into 
a predetermined voiceless part indication data sig- 
nal, and a recording means for recording an input 
voice signal including the voiceless part indication
data signal converted by the voiceless part indication
conversion means. More concretely, there is provided a
voice recording/reproducing device comprising a
voiceless part indication conversion means which
converts a voiceless part included in an input voice
signal into a predetermined voiceless part indication
data signal, a recording means for recording an input
voice signal including the voiceless part indication
data signal converted by the voiceless part indication
conversion means, and a voice reproducing means for
reproducing an input voice signal recorded in said
recording means at a desired speed of speech.

Through intense study, the present inventors have
realized a voice recording/reproducing device that can
be used as a device for learning and education or as a
voice-producing electronic book. Digital voice data is
recorded in a recording medium with the voiceless
parts substantially deleted, and voiceless time is
added at the time of the reproduction. This enables
the voice data to be recorded in sufficient amounts in
the recording medium and, besides, makes it possible
to obtain voice output close to natural recitation for
extended periods of time as a result of the addition
of the voiceless time during the reproduction.

BRIEF DESCRIPTION OF DRAWINGS 

Fig. 1 is a diagram illustrating a recording means
according to an embodiment of the present invention;

Fig. 2 is a diagram illustrating a reproducing means
according to an embodiment of the present invention;

Figs. 3(A), 3(B), 4(A) and 4(B) are diagrams for
explaining an embodiment of the present invention;

Figs. 5(A) and 5(B) are diagrams illustrating a method
of discriminating the length of a voiceless part
according to the present invention;

Fig. 6 is a diagram illustrating another embodiment of
the present invention;

Fig. 7 is a flow chart which concretely illustrates a
voiceless part processing unit shown in Fig. 6;

Figs. 8(A) and 8(B) are diagrams illustrating another
example of signals when the voiceless part is
substantially deleted;

Fig. 9 is a block diagram illustrating the constitution
according to another embodiment of the present
invention;

Fig. 10 is a flow chart for discriminating the
voiceless part according to the present invention;

Figs. 11 and 12 are flow charts illustrating the
procedures of operation for executing the method of
deleting a voiceless part according to another
embodiment of Fig. 9;

Figs. 13(A) to 13(C) are diagrams illustrating examples
for discriminating plus/minus component modes
according to the present invention;

Fig. 14 is a flow chart for explaining the procedure of
operation by a recombination means according to the
embodiment of Fig. 9; and

Fig. 15 is a diagram of waveforms illustrating an input
voice signal used in the present invention.

BEST MODE FOR CARRYING OUT THE INVENTION

A voice recording/reproducing device accord- 
ing to an embodiment of the present invention will 
now be described in detail with reference to the 
drawings. 

Fig. 1 is a block diagram which schematically 
illustrates the constitution of a recording means 
100 for recording input voice signals in a voice 
recording/reproducing device according to an em- 
bodiment of the present invention. The recording 
means 100 of Fig. 1 is constituted by a voiceless 
part indication conversion means 5 which converts 
a voiceless part included in an input voice signal 
into a predetermined voiceless part indication data 
signal, and a recording means 11 for recording an 
input voice signal including the voiceless part in- 
dication data signal converted by the voiceless part 
indication conversion means 5. 

The above embodiment will now be described 
in further detail. 

In Fig. 1, reference numeral 11 denotes a recording
medium which chiefly consists of a digital recording
medium such as an optical disc, a magneto-optical disc
or a magnetic disk, and 111 denotes a writing means
constituted by a writing head, a head-driving driver,
and the like. Reference numeral 1 denotes an analog
voice input means constituted by a microphone, a
filter, an amplifier and the like, and 2 denotes an A/D
conversion means which converts an analog voice signal
into a digital voice signal. The A/D conversion means 2
will often be provided with a digital signal
compression means such as ADPCM. Reference numeral 3
denotes a voiceless part detection means which detects
the voiceless part automatically or visually, and 4
denotes a conversion unit which receives output signals
from the voiceless part detection means 3 and from the
A/D conversion means 2, and deletes the voiceless part
of the digital voice signal or converts it into another
code based upon an input signal from the voiceless part
detection means 3. According to the present invention,
the voiceless part detection means 3 and the conversion
unit 4 constitute the voiceless part indication
conversion means 5. The conversion unit 4 may execute
an algorithmic processing by using a CPU or a DSP. In
this case, there will be no distinction between the two
means 3 and 4. Fig. 1 illustrates the constitution in
which the analog voice is first converted into a
digital voice, and then the voiceless part is
substantially deleted. The constitution, however, is
not necessarily limited thereto but may be one in which
the voiceless part is substantially deleted during the
step of digital conversion or at the time of producing
analog voice.

In the present invention, the voiceless part in an
input voice signal that is handled is a silent part or
a part close to a silent state between syllables or
between clauses. In the present invention, the
voiceless part indication conversion means 5
substantially deletes the voiceless part, i.e., deletes
all or part of the voiceless part or converts the
voiceless part into other code.

The method of converting such a voiceless part into
other code may comprise, for example, converting it
into a signal that deletes the voiceless part, or
converting it into a signal of data related to the time
of the voiceless part, or converting it into a signal
of data that indicates the position where it exists in
the input voice signal that contains the voiceless
part.

Fig. 2 schematically illustrates the constitution of a
voice reproducing means 200 used in the voice
recording/reproducing device of the present invention.

Referring to Fig. 2, provision is made of a voice
reproducing means 17 which reproduces at a desired
speed of speech the input voice signals that are voice
data stored in the recording medium 11 by the voice
recording means 100 shown in Fig. 1.

Described below is the concrete constitution of the
voice reproducing means 200 used in the voice
recording/reproducing device of the present invention.
That is, Fig. 2 illustrates the voice reproducing means
200, which hereinafter is referred to as the
reproduction unit, and wherein reference numeral 11
denotes the recording medium that is shown in Fig. 1.

Reference numeral 112 denotes a reading means which is
constituted by a reading pick-up, a means for turning
the recording means 11, and a means for sliding the
reading pick-up.

Reference numeral 12 denotes a detection means which
detects the voiceless part that has been substantially
deleted from the digital voice output from the reading
means 112, restores the detected voiceless part, or
newly forms the voiceless part, or converts it into a
signal having an equivalent meaning, and outputs it.

Reference numeral 13 denotes an adjusting means which
combines a digital voice signal output from the reading
means 112 and a voiceless signal output from the
detection means 12, and outputs a combination signal.
The detection means 12 and the adjusting means 13
constitute the voice reproducing means 17 of the
present invention. The adjusting means 13 may often be
algorithmically implemented by using a CPU or a DSP
one-chip microcomputer. In this case, there is no need
to distinguish the two means 12 and 13, and these means
need only have an algorithm, such as a program routine,
which converts at least the deleted voiceless part into
any voiceless time or a voiceless digital signal having
the original voiceless time, and outputs it in
combination with the digital voice.

Reference numeral 14 denotes a D/A conversion means
which converts the digital voice output from the
adjusting means 13 into an analog voice. In this case,
when the A/D conversion means 2 shown in Fig. 1 has a
compression means, the D/A conversion means 14 has a
restoration means. The D/A conversion means 14 may
constitute the above-mentioned voice reproducing means
17, or may serve as the detection means 12 or the
adjusting means 13.

Reference numeral 15 denotes an amplifier 
means which electrically amplifies the analog 
voice. The amplifier means 15 may have frequency 
filtering characteristics. 

Reference numeral 16 denotes a voice genera- 
tion means which is constituted by either a speaker 
or an earphone or by both of them. The recording 
unit and the reproducing unit may be constituted 
together as a unitary structure or may be sepa- 
rately constituted. 

Described below with reference to Figs. 1 and 
2 are operations of the recording means 100 and of 
the reproducing means 200 in the voice record- 
ing/reproducing device of the present invention. 

In the recording unit 100 shown in Fig. 1, an analog
voice input to the analog voice input unit 1 is
filtered, amplified, and is then converted through the
A/D conversion means 2 into a digital voice signal as
shown in Fig. 3(A). The digital voice signal is input
to the voiceless part detection means 3 and to the
conversion means 4. The voiceless part detection means
3 detects a voiceless part 31 as shown in Fig. 3(A),
and the conversion means 4 substantially deletes the
voiceless part 31 and, instead, converts it into a
digital voice signal 32 (Fig. 3(B)) that indicates the
position where the voiceless part 31 has been
substantially deleted, and writes it into the recording
medium 11 via the writing means 111.
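The recording-side operation of Figs. 3(A) and 3(B) can be sketched in software roughly as follows. This is only an illustrative sketch and not the circuitry of Fig. 1: the function name `encode_silence`, the amplitude threshold, the minimum run length and the `("SILENCE", n)` marker are all assumptions made for the example.

```python
def encode_silence(samples, threshold, min_run):
    """Replace each run of at least `min_run` near-silent samples with a marker.

    Returns a list whose items are either a list of voiced samples or a
    ("SILENCE", run_length) tuple marking where a voiceless part was deleted.
    All names and the marker format are hypothetical.
    """
    out = []
    voiced = []
    i = 0
    n = len(samples)
    while i < n:
        if abs(samples[i]) <= threshold:
            # Measure the length of the near-silent run starting at i.
            j = i
            while j < n and abs(samples[j]) <= threshold:
                j += 1
            run = j - i
            if run >= min_run:
                if voiced:
                    out.append(voiced)
                    voiced = []
                out.append(("SILENCE", run))
            else:
                # Too short to count as a voiceless part: keep as voice data.
                voiced.extend(samples[i:j])
            i = j
        else:
            voiced.append(samples[i])
            i += 1
    if voiced:
        out.append(voiced)
    return out
```

Short quiet stretches below `min_run` are deliberately kept as voice data, so only pauses of meaningful length are replaced.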

A sequence of digital voice signals is very complex and
is, hence, roughly drawn. A sequence of digital voices
covers a syllable, a clause, a breath group, or from a
voiceless part to a next voiceless part.

Referring next to Fig. 2, which illustrates the
reproducing means 200 for reproducing digital voice
signals recorded in the recording medium 11, the
recording means 11 is connected to the reading means
112. The digital voice signals, from which the
voiceless parts have been substantially deleted, are
read out from the recording means 11 and are output to
the detection means 12 and to the adjusting means 13
that constitute the voice reproducing means 17. The
detection means 12 detects a voiceless part indication
data signal, such as the signal 32 marking the deleted
voiceless part shown in Fig. 3(B), converts it into a
voiceless digital signal having a predetermined time
interval or the original time interval, and outputs it
to the adjusting means 13. The adjusting means 13
combines the digital voice signal from which the
voiceless part has been substantially deleted, input
from the recording means 11, with the voiceless digital
signal input from the detection means 12 at the deleted
portion, and outputs the combined digital voice signal
(Fig. 3(A)) to the D/A conversion means 14. The D/A
conversion means 14 converts the combined digital voice
signal that is input into an analog voice signal. The
amplifier means 15 amplifies the analog voice signal,
filters it depending upon the case, and outputs it to
the voice generation means 16 which outputs the voice
using a speaker or an earphone as a medium. At this
moment, the speech speed can be freely adjusted by
numerically adding or subtracting the amount of
voiceless digital signals, and slow-speed speech can be
easily realized. A knob for adjustment will often be
provided on the device so that the user is allowed to
effect the adjustment.
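As a rough illustration of how the adjusting means 13 might vary the speech speed only by changing the amount of inserted silence, consider the sketch below. It assumes a hypothetical stored format in which voiced runs are lists of samples and each deleted voiceless part is a `("SILENCE", length)` tuple; `pause_scale` stands in for the adjustment knob and is likewise an assumption.

```python
def reproduce(segments, pause_scale=1.0, silence_value=0):
    """Rebuild a sample stream, scaling only the voiceless parts.

    `segments` holds lists of voiced samples and ("SILENCE", length)
    markers (a hypothetical format). The voiced samples are emitted
    unchanged, so the voice itself is not stretched or pitch-shifted.
    """
    out = []
    for seg in segments:
        if isinstance(seg, tuple):
            _, length = seg
            # Lengthen or shorten the pause according to the knob setting.
            out.extend([silence_value] * max(0, round(length * pause_scale)))
        else:
            out.extend(seg)
    return out
```

With `pause_scale=2.0` every pause is doubled, giving slower recitation while each syllable is reproduced at its normal speed.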

Moreover, a digital voice signal from which the
voiceless part is substantially deleted and which
includes a suitable conversion indication data signal
may often be recorded into the recording medium as
shown in Figs. 4(A) and 4(B). In the recording unit of
Fig. 1, the voiceless part detection means 3 and the
conversion means 4 replace the voiceless part 41 in the
original digital voice signal shown in Fig. 4(A) by
other code 42 as shown in Fig. 4(B). The digital voice
signal shown in Fig. 4(B) is written into the recording
means 11 via the writing means 111. The other code 42
shown in Fig. 4(B) consists of several bits serving not
only as a simple indication but also as data related to
a voiceless time interval and data representing the
nature of the voiceless part.

In the reproduction unit 200 of Fig. 2, the recording
means 11 records digital voice signals shown in Fig.
4(B). The reading means 112 reads out the digital voice
signals recorded in the recording means 11, from which
the voiceless part is substantially deleted, and
outputs them to the detection means 12 and to the
adjusting means 13.

The detection means 12 detects a code that has been
substituted for the voiceless part deleted from the
input digital voice signal, decodes the code, and
outputs a signal that complies with the decoded content
to the adjusting means 13. The content of the other
code 42 shown in Fig. 4(B) is a time width of the
original voiceless part or the like as described above.
In response to the signal input from the detection
means 12 and the digital voice signal from which the
voiceless part has been deleted input from the reading
means 112, the adjusting means 13 outputs to the D/A
conversion means 14 a digital voice signal (Fig. 4(A))
to which the voiceless part is added or in which the
voiceless part is reproduced. The subsequent operation
of the D/A conversion means 14 is the same as the one
mentioned earlier and is not described here.
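The text states only that the code 42 consists of several bits carrying a time interval and the nature of the voiceless part; it does not give a bit layout. Purely as an assumed example, one byte could hold a 2-bit "nature" field and a 6-bit duration field, as sketched here with hypothetical helper names.

```python
def pack_code(kind, duration_units):
    """Pack a hypothetical one-byte code: 2 bits of kind, 6 bits of duration."""
    if not 0 <= kind < 4:
        raise ValueError("kind must fit in 2 bits")
    if not 0 <= duration_units < 64:
        raise ValueError("duration must fit in 6 bits")
    return (kind << 6) | duration_units


def unpack_code(code):
    """Recover (kind, duration_units) from a byte produced by pack_code."""
    return code >> 6, code & 0x3F
```

A detection means built on this layout would call `unpack_code` and hand the duration to the adjusting means; a real device would choose field widths to suit its sampling rate and longest expected pause.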

Next, described below is another algorithm of the
present invention which detects the voiceless part from
the analog voice input signal, substantially deletes
the voiceless part, and replaces the duration thereof
by a code that represents a time function. Fig. 15
illustrates an input voice signal used in the present
invention; the input voice signal, however, is in no
way limited thereto.

In the recording unit of Fig. 1, a window (WD) of Fig.
5(A) is set in advance for the voiceless part. Lth
denotes a threshold value for judging the signal to be
voiceless and has been set in the (+) and (-)
directions. Signs A to D of Fig. 5(A) have been
determined in advance, and initial values of time
widths among the signs A to D have been determined in
advance, too. The time widths are only initial values
and can be varied. At the present moment ts, a minimum
tn is found that satisfies the following relation (1)
from a moment ts + 1 to a moment ta,

| V(tn) - V(ts) | > Lth    (1)

When tn is not found, the sign A is selected, and a
next window (WD1) is set with the sign A as the present
moment ts, and the operation is carried out to find a
next tn in the window shown in Fig. 5(A).

In other cases, for instance, the sign B is selected
when tb < tn ≤ ta and, then, no sign is imparted.
Similarly, hereinafter, the sign C is selected when
ts + 2 < tn ≤ tb, the sign D is selected when
tn ≤ ts + 2 and, then, no sign is imparted.

Then, the processing for deleting a voiceless part is
resumed when | V(ti) - V(ts) | ≤ Lth. At this moment, a
sign is imparted to indicate the resumption.

When the sign A is selected repetitively or at a 
high frequency, the whole or part of the time widths 
among the signs A to D is lengthened. 

V(ti) represents a voltage at a moment ti which
precedes or succeeds the present moment ts by a
predetermined amount of time. This embodiment uses four
signs A to D that can be expressed by using two bits.
Therefore, the voiceless part on the recording means is
substituted by a small amount of sign sequence. The
number of signs should be small though there is no
particular limitation. The signs A to D determined in
the above-mentioned step are recorded into the
recording medium 11 via the writing means 111.
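The sign selection based on relation (1) and the two-bit representation can be sketched as follows. This is a minimal sketch under assumptions: the checkpoint offsets `off_c`, `off_b` and `off_a` (standing for ts + 2, tb and ta measured from ts), the function names, and the packing order within a byte are all hypothetical.

```python
SIGN_BITS = {"A": 0, "B": 1, "C": 2, "D": 3}  # assumed 2-bit assignment


def select_sign(v, ts, off_c, off_b, off_a, lth):
    """Return the sign for the window starting at sample index ts.

    Searches for the smallest tn in (ts, ts + off_a] with
    |v[tn] - v[ts]| > lth. 'A' means no such tn exists in the window.
    """
    end = min(ts + off_a, len(v) - 1)
    for tn in range(ts + 1, end + 1):
        if abs(v[tn] - v[ts]) > lth:
            off = tn - ts
            if off > off_b:
                return "B"   # tb < tn <= ta
            if off > off_c:
                return "C"   # ts + 2 < tn <= tb
            return "D"       # tn <= ts + 2
    return "A"


def pack_signs(signs):
    """Pack a sequence of signs into bytes, four 2-bit signs per byte."""
    out = bytearray((len(signs) + 3) // 4)
    for i, s in enumerate(signs):
        out[i // 4] |= SIGN_BITS[s] << (2 * (i % 4))
    return bytes(out)
```

Because each sign occupies two bits, even a long pause described by several consecutive windows costs only a byte or two on the recording medium.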

Described below with reference to Fig. 5(B) is another
method of imparting a discrimination sign by detecting
the time width of the voiceless part of the input voice
signal according to the present invention.

In this embodiment as shown in Fig. 5(B), a first
window (WD1) is set, and four kinds of check points A
to D having different time factors are set within the
window. If the initial time of the window WD1 is ts, a
check point D is disposed at a position that
corresponds to the time ts + 1. A time interval between
the initial time and the check point D is denoted by
1/4·Δt.

Similarly, a check point C is disposed at a position
that corresponds to the time ts + 2, and a time
interval between the check point D and the check point
C is denoted by 1/4·Δt.

A check point B is disposed at a position corresponding
to the time ts + 3, and a time interval between the
check point C and the check point B is denoted by Δt.

Moreover, the check point A is disposed at a position
corresponding to the time ts + 4, and a time interval
between the check point B and the check point A is
denoted by Δt.

After the window WD1 is set, a voice signal N is input
from an external unit, and an input voice voltage V(n)
is compared with the above-mentioned threshold value
Lth that has been determined in advance.

In this embodiment, the voltage V(n) of the voice input
signal becomes smaller than the threshold value Lth at
the time ts, and the above-mentioned relation (1) is
satisfied.

At this moment, the window WD1 is set, and the input
voice voltage V(n) is compared with the threshold value
Lth continuously, maintaining a predetermined sampling
time interval, during the time of inspection set in the
window WD1.

In the example of Fig. 5(B), the input voice signal N
satisfies the above-mentioned relation (1) within a
time ts + 4 that is determined in advance in the window
WD1. During this time, therefore, it is so judged that
the voiceless part is continuing, and a discrimination
sign, i.e., sign A is imparted to the voiceless part
during that moment.

In this embodiment, furthermore, when it is judged that
the voiceless part is continuing at a moment when the
time ts + 4 has passed, a next window WD2 is set at
this moment.

That is, the initial time of the window, i.e., ts, is
started again at the time ts + 4 at the check point A.

In this embodiment, furthermore, the voiceless part
continues in the window WD1. That is, since it is
expected that the voiceless part lasts long even in the
next window WD2, the time of the second window WD2 is
set to last longer than the time set for the first
window WD1.

That is, if the initial time of the window WD2 is ts,
the check point D is disposed at a position
corresponding to the time ts + 1, and a time interval
between the initial time and the check point D is
denoted by 1/4·Δt.

Similarly, the check point C is disposed at a position
corresponding to the time ts + 2, and a time interval
between the check point D and the check point C is
denoted by 1/4·Δt.

The check point B is disposed at a position
corresponding to the time ts + 3, and the time interval
between the check point C and the check point B is
denoted by 2Δt + 3.

Furthermore, the check point A is disposed at a
position corresponding to the time ts + 4, and the time
interval between the check point B and the check point
A is denoted by 2Δt + 3.

In this embodiment, the voltage V(n) of the input voice
signal exceeds the threshold value Lth just before the
check point A in the second window WD2, from which it
will be learned that the voiceless part is finished.

For this purpose, the sign B which is the dis- 
crimination sign is imparted to the voiceless part of 
the input voice signal in the window WD2. 

In this embodiment, therefore, a sign sequence A·B is
imparted to the voiceless part of the input voice
signal, and the discrimination signs are read at the
time of reproduction in order to execute the
reproduction operation while inserting the voiceless
part in a predetermined position of a predetermined
input voice signal that is reproduced for a period of
time that corresponds to the discrimination signs A·B.

Described below is the operation of the re- 
producing means 200 that reproduces the record- 
ing means of Fig. 2 in which digital voice has been 
recorded. 

A digital voice signal recorded in the recording means
11 is read out by the reading means 112 and is input to
the detection means 12 and to the adjusting means 13.
The detection means 12 detects the signs A to D or a
signal and a sign that indicate the start of the
voiceless part shown in Fig. 5(A), applies them to the
window shown in Fig. 5(A), restores the voiceless part
having a time width corresponding to the sign, and
outputs it to the adjusting means 13 which inserts the
voiceless part output from the detection means 12 in
the portions of signs A to D of the digital voice. When
the detection means 12 detects the sign A repetitively,
part or all of the time widths of signs A to D shown in
Fig. 5(A) is lengthened, and the time width of the
restored voiceless part is automatically lengthened in
proportion to the number of repetitions.
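One possible reading of this restoration rule is sketched below. The text does not specify the exact lengthening rule, so the sketch assumes that every time width is multiplied by one plus the number of consecutive A signs seen immediately before it; the base widths and the function name are likewise hypothetical.

```python
def restore_silence_length(signs, base_widths):
    """Total restored silence (in samples) for one sequence of signs.

    Assumed rule: after k consecutive 'A' signs, the width of the next
    sign is scaled by (1 + k), so repeated 'A' lengthens later windows.
    """
    total = 0
    a_run = 0  # number of consecutive 'A' signs seen so far
    for s in signs:
        total += base_widths[s] * (1 + a_run)
        a_run = a_run + 1 if s == "A" else 0
    return total
```

For example, with base widths A=10, B=8, C=4, D=2, the sequence "AB" restores 10 + 2 × 8 = 26 samples of silence. The recording side would have to apply the same lengthening rule for the restored duration to match the original pause.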

As described above, the voiceless part is automatically
replaced by a small amount of signs at the time of
recording, and the voiceless time is correctly restored
from that small amount of signs at the time of
reproduction. In addition, the processing time for
restoration is so short that the reproduced voice
output is not interrupted.

The imparting of the above-mentioned signs A to D and
the content of processing based on the signs are only
illustrative of the invention, and it should be noted
that the invention is in no way limited thereto.

The device constituted by using the above-mentioned
embodiment should be of a portable size. The device
constituted in the form of a study book may
additionally have a function for repetitively
outputting the voice and a function of a bookmarker.
The size of the device is determined depending upon the
size of the recording medium. Therefore, the recording
medium should be small in size and should have a large
capacity as represented, for example, by a CD-ROM, a
magneto-optical mini-disc, a 3.5-inch floppy disk, a
digital audio tape or the like.

The digital voice may be obtained, without any
limitation, by subjecting a synthesized voice or a
natural voice to A/D conversion or compression by an
existing system.

Next, described below is a method of determining
whether a voiceless part is included in the input voice
signal or not according to the procedure shown in a
flow chart of Fig. 10 which is based on the use of
software. The device basically employs the constitution
shown in Fig. 1. First, a step 2a inputs a
predetermined amount of voice data that has been
digitized. Here, the predetermined amount is a number
such as 1024 units which varies depending upon a
storage element for temporary storage. However, the
predetermined amount may often not be used.

In order to distinguish voice data from a control code
that is set to be used as a sign for representing the
voiceless part, in a step 2b, the data which is the
same as, or is similar to, the control code is changed.
This change is accomplished by, for example, adding +1
(increment) to the data.
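Step 2b can be illustrated with a very small sketch. The actual value of the control code is not given in the text, so `CONTROL_CODE = 0x000` is purely an assumed placeholder, as is the function name.

```python
CONTROL_CODE = 0x000  # hypothetical reserved value; the text gives no value


def escape_control_code(samples, control_code=CONTROL_CODE):
    """Return a copy in which any sample equal to the control code is
    incremented by 1, so the code never appears as ordinary voice data."""
    return [s + 1 if s == control_code else s for s in samples]
```

After this pass the reserved value is free to be written into the stream as an unambiguous voiceless-part marker, at the cost of a one-step amplitude change in the affected samples.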

The content of the voice data is changed by +1
(increment) without, however, at all affecting the
reproduced output voice. Next, a step 2c executes the
calculation for determining the start and end of the
voiceless part. The calculation is carried out by
dividing voice data into predetermined block units
(e.g., each block consists of 64 data) to obtain data
such as voice amplitude distribution in the block. A
step 2d detects how much of the amplitude distribution
of data obtained in the step 2c exists in a
predetermined range (e.g., 800h ≤ X ≤ 80Fh). If more
than 90% of the data in a block lies within the
voiceless range, this is tentatively regarded to be the
start of a voiceless part (yes), and the answer (no) is
output in other cases. When the answer is (yes) at the
step 2d, it is examined at a step 2e whether a
voiceless part start flag is turned on or not. When the
flag is turned off, the start of the voiceless part is
reliably determined at this moment, whereby the answer
(no) is output and the voiceless part start flag is
turned on at a point where the voiceless part is
tentatively started (step 2f). When the voiceless part
start flag has been turned on already in the step 2e,
the data is just in the voiceless part and the answer
(yes) is output. A step 2g reads the next predetermined
number of blocks and a step 2h determines whether the
data blocks have been finished or not. When the data
blocks have not been finished, the program returns back
to the step 2c of calculating the voiceless range. When
the data blocks have been finished, the answer (yes) is
output and a step 2i determines whether all the data
have been finished or not. When all the data have not
been finished, the answer (no) is output, and the step
2a reads again 1024 data, temporarily stores the data,
and the processings subsequent to the step 2a are
executed.

When the data is not in the voiceless range in 
the step 2d. the answer (no) is output, and a step 
2j determines whether this is the end of the voice- 
less pan or not. The state in which more than 50% 
of the predetermined number of data does not exist 
in the voiceless range serves as a reference of 
judgement. When it is judged at the step 2j that the 
voiceless part has finished (yes), the program pro- 
ceeds to a step 2k. When the answer is (no), the 
data is not the voiceless part and the program 
proceeds to the step 2g where the next predeter- 
mined number of data blocks are read. The step 2k 
judges whether the voiceless part start flag is 
turned on or not. When it is turned off, the answer 
(no) is output and the program proceeds to the 
step 2g. When the voiceless part start flag Is 
turned on, the answer is (yes) and the step 21 
calculates the number of bytes in the voiceless 
section. Here, when the interval between the point 
of tentatively starting the voiceless part and the 
point of tentatively ending th© voiceless part is 
longer than a predetermined period of time, there 
are set a point of truly starting the voiceless part 
and a point of truly ending the voiceless part. 
When a duration of the voiceless part that is set 
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depending upon the cases is shorter than the pre- 
determined period of time, then this duration is not 
regarded as a voiceless section and is neglected. 
This processing serves to prevent the unpleasant impression
during the reproduction that would arise if a brief voiceless
state between one sound and another, occurring while the
recitation is continuing, were regarded as an object of
deletion and deleted. A step 2m determines a control code based
on the detection of the voiceless state and further determines
a byte number code that represents the voiceless section, both
of which are then stored in the voice data. A step 2n turns the
voiceless part start flag off. Thus, the series of voiceless
state detection processes is finished, and the program proceeds
to the step 2g to read the next data block.
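
The detection flow of steps 2j to 2n can be illustrated with a minimal sketch. It is not the patented implementation: the function names, the block size, the threshold, the minimum run length and the tuple used as a stand-in for the control code and byte number code are all assumptions made for illustration. Only the ideas of judging a block by its share of quiet samples, discarding runs that are too short, and replacing long runs by a code plus a length come from the description.

```python
LEVEL_ZERO = 0x808  # level-0 value quoted in the text for 12-bit data


def block_is_voiceless(block, threshold):
    """Assumed rule: a block is voiceless when at least half of its
    samples lie within `threshold` of level 0."""
    quiet = sum(1 for s in block if abs(s - LEVEL_ZERO) <= threshold)
    return quiet * 2 >= len(block)


def encode_voiceless(samples, block_size=4, threshold=8, min_blocks=2):
    """Replace long voiceless runs by ('SILENCE', sample_count) tokens.

    The tuple is a hypothetical stand-in for the control code and the
    byte number code; runs shorter than min_blocks are left untouched.
    """
    out = []
    pending = []  # tentative voiceless run ("start flag" is on while non-empty)

    def flush():
        if len(pending) >= min_blocks * block_size:
            out.append(("SILENCE", len(pending)))  # long enough: encode it
        else:
            out.extend(pending)  # too short: keep as ordinary voice data
        pending.clear()  # corresponds to turning the start flag off

    for i in range(0, len(samples), block_size):
        block = samples[i:i + block_size]
        if block_is_voiceless(block, threshold):
            pending.extend(block)
        else:
            flush()
            out.extend(block)
    flush()
    return out
```

Short pauses inside a phrase survive unchanged, which mirrors the stated aim of not deleting brief voiceless states between sounds.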

Next, the voice recording/reproducing device according to
another embodiment of the present invention will be described
with reference to Figs. 6 to 8.

The voiceless part referred to in the present invention is a
voiceless part or a part close to a voiceless state between
syllables or between clauses, as mentioned earlier. The
substantial encoding of the voiceless part will be to convert
the whole or part of the voiceless part designated at 31 in
Fig. 3(A) into a state designated at 32 in Fig. 3(B), or to
convert the voiceless part designated at 41 in Fig. 4(A) into
another sign 42 shown in Fig. 4(B), or to collect the whole or
part of the voiceless parts (M1, M2, ..., Mn, ...) into a
particular region as shown in Fig. 8(B) for the voice parts
(O1, O2, ..., On, ...) that are shown in Fig. 8(A).

Fig. 6 is a block diagram illustrating the constitution of the
voice recording/reproducing device according to another
embodiment of the present invention, wherein reference numeral
11 denotes a recording medium or a recording means which may be
a magneto-optical disc such as a MD or MO, an optical disc such
as a CD or MD, a magnetic disk or an IC memory medium, which is
small in size and has a large capacity.

Reference numeral 22 denotes a drive element such as a spindle
motor or a sled motor for driving the recording medium 11 and a
pick-up. The drive element may not be provided depending upon
the kind of the recording medium.

Reference numeral 23 denotes an RF amplifier which amplifies
and shapes the signals that are read. Reference numeral 24
denotes an adjusting means which is constituted by a DSP or the
like and corrects errors, and which is further equipped with an
EFM demodulation means when a PLL or a general-purpose CD
reproduction device is used. Reference numeral 25 denotes a
drive means which reads the number of revolutions of the drive
element 22 and determines the positioning. Reference numeral 26
denotes a voiceless part conversion means which is constituted
by a microcomputer or an ASIC and reproduces the encoded
voiceless part depending upon the code thereof and an input
from an external unit. Reference numeral 27 denotes an input
control means which is constituted by a microcomputer or the
like, outputs a control signal that is based upon an input
signal from an external input 71 and an output signal of the
voiceless part conversion means 26, and feeds the control
signal to the adjusting means 24 and to the voiceless part
conversion means 26. That is, in this embodiment, a voiceless
part indication conversion means 20 is constituted by the RF
amplifier 23, the adjusting means 24 and the voiceless part
conversion means 26.

Reference numeral 28 denotes a D/A conversion means which
converts a digital voice signal into an analog signal and
outputs it. When the data have been compressed, such as by
ADPCM or ATC, the corresponding restoration processing is
further included.

Reference numeral 29 denotes an amplifier means which amplifies
an analog voice signal and outputs it, and reference numeral 30
denotes a voice output means which is constituted by a speaker,
an earphone or the like.

The operation will now be described.

A mixture signal consisting of a digital voice signal obtained
from the recording medium 11 via the drive element 22 and
encoded voiceless data, such as a signal 32 of Fig. 3(B) or a
signal 42 of Fig. 4(B), is amplified and shaped through the RF
amplifier 23, and is input to the adjusting means 24. The
adjusting means 24 corrects errors in the input data and
effects EFM demodulation, and outputs a signal to the drive
circuit 25 to adjust the revolving speed of the recording
medium and to adjust the movement of the pick-up. The voiceless
part conversion means 26 detects encoded voiceless data in the
output from the adjusting means 24, converts it into voice data
that represents a voiceless state and outputs it. The voiceless
part conversion means 26 readily changes the range of voice
data in the voiceless part in response to a signal from the
input control means 27. Depending upon an input by a key or by
a control knob from the input 71, the input control means 27
permits a signal corresponding to the input to be fed to the
adjusting means 24. For instance, an input by a key enables the
adjusting means 24 to receive a signal that is based upon such
an operation as a pause, finding the head or a repeat, in order
to adjust the drive circuit 25. The input control means 27
outputs a signal for adding or deleting a signal range that
indicates the range of a voiceless part in order to increase or
decrease the speed of speech.
The voice data output by the voiceless part conversion means 26
is converted into analog voice through the D/A conversion means
28, amplified through the amplifier 29 and reproduced by the
voice output means 30. When the recording medium 11 is read
with the specifications of 44.1 kHz and a resolution of about
16 bits like CD players in general, and when the D/A conversion
and the like are adjusted to accomplish a relatively low
processing speed and to obtain a slow speech speed, such as
with 8 kHz and a resolution of 4 bits or 12 bits during the
restoration of a voiceless part, the voiceless part conversion
means 26 may output a control signal to the input control means
27 in order to stop or delay the reading, even temporarily. At
this moment, the input control means 27 causes the adjusting
means 24 to stop the reading operation while maintaining the
revolving speed constant, or to lower the revolving speed, or
to stop the operation.

The operation of the voiceless part conversion means 26 will be
concretely described with reference to Fig. 7. After the start,
a step 21 receives a predetermined amount of digital voice data
from the adjusting means 24 and temporarily stores them in a
memory. The memory will not be used when the voiceless parts
are sequentially processed. After a predetermined amount of the
digital voice data are received, the voiceless part conversion
means 26 will, depending upon the case, output a signal to the
input control means 27 to stop or suppress the reading of data
from the recording medium 11. The input control means 27
outputs this input signal to the adjusting means 24 to control
the drive element 22 and the like.

After a predetermined amount of the voice data are received at
the step 21, a step 2a confirms the input of a parameter from
the input control means 27 and determines whether or not a
parameter is input. When the answer is yes, a step 2b sets the
number of parameters corresponding to the input. A step 22
confirms whether or not the voiceless part is being output.
This step confirms the state of a flag that indicates the
output of voiceless part. When the flag indicating the output
of voiceless part is raised (yes), the program proceeds to a
step 26 where the voiceless part output processing is carried
out. When the flag is cleared (no), the program proceeds to a
next step 23 where voice data is found and a parameter is set
when there exists a control code. Then, a step 24 determines
whether or not the processing of voice data is finished. When
the processing is finished (yes), the operation is carried out
to further read the data and comes to an end. When the
processing is not finished (no), the program proceeds to a step
25 which determines whether or not the data is a voiceless part
control code. When it is not the voiceless part control code,
the program proceeds to a step 27 assuming that the data
represents the voice. When the data is the voiceless part
control code, the program proceeds to the voiceless part output
processing step 26.

In the voiceless part output processing of step 26, the flag
indicating the output of voiceless part is raised and data
(e.g., 808h) is output to represent voice of a level 0.
Furthermore, the parameter set in the step 2b is added to or
subtracted from the parameter that indicates the voiceless
section, and 1 is subtracted (decrement) from this value every
time. When the added or subtracted parameter, decreased by 1
each time, becomes 0, the flag indicating the output of
voiceless part is cleared and the voiceless part is no longer
output. After the above two parameters are added or subtracted,
the input parameter set at the step 2b is cleared. After the
step 26 of the voiceless part output processing is finished, a
step 32 waits for a time to process the next data.
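
The voiceless part output processing can be sketched as follows. This is an illustration under stated assumptions, not the device's firmware: the token format ('SILENCE', count) is a hypothetical stand-in for the voiceless part control code and its length parameter, and `speed_offset` stands in for the parameter set at the step 2b. A positive offset lengthens every pause (slower speech) and a negative one shortens it (faster speech), while the voice samples themselves are left untouched.

```python
LEVEL_ZERO = 0x808                        # level-0 sample quoted in the text
SAMPLE_PERIOD_US = 1_000_000 / 8_000      # 125 microseconds at 8 kHz


def expand_stream(tokens, speed_offset=0):
    """Expand ('SILENCE', n) tokens into level-0 samples.

    speed_offset (hypothetical name) is added to each pause length;
    the result is clamped at zero so a pause can vanish but not go negative.
    """
    out = []
    for tok in tokens:
        if isinstance(tok, tuple) and tok[0] == "SILENCE":
            remaining = max(0, tok[1] + speed_offset)
            while remaining > 0:          # decrement until the counter hits 0
                out.append(LEVEL_ZERO)
                remaining -= 1
        else:
            out.append(tok)               # ordinary voice sample
    return out


def duration_ms(samples):
    """Playback time of a sample list at the 8 kHz rate used in the text."""
    return len(samples) * SAMPLE_PERIOD_US / 1000
```

Because only the pauses are stretched or shrunk, the pitch of the voice parts is unaffected, which is the property the description relies on for natural-sounding output at varied speeds.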

In this embodiment, the sampling frequency at the time of
demodulation has been set to 8 kHz. Therefore, the step 32
waits for about 125 microseconds, i.e., one sampling period
(1/8000 second).

When the data at the step 25 is not the voiceless part control
code (no), i.e., when the data is voice data, a step 27
determines whether the data is of an even number. When the data
is of an even number (yes), a step 28 shifts the data by four
bits, i.e., shifts the data to an odd number so that it can be
processed in units of bytes. Then, a step 29 extends the voice
data of 4 bits to the voice data of 12 bits. The extension
algorithm is as described below. The first time, the previous
data (Y) is taken to be the level 0 (808h); the second and
subsequent times, the data obtained by the following
calculation is used as the previous data. That is, a reference
value is subtracted from the 4-bit voice data (X), the result
is multiplied by a multiplication factor (m), and the previous
12-bit data (Y) is added to find the 12-bit voice data.

12-bit voice data = ((X - reference value) × m) + Y

After the voiceless part, the data of the pre- 
vious time is set to level 0. 
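
The extension from 4 bits to 12 bits can be written out as a short sketch. The function name and the clamping to the 12-bit range are assumptions added for illustration; the recurrence itself is the formula given above, with the previous value starting at level 0 (808h).

```python
LEVEL_ZERO = 0x808  # starting value of the previous data Y


def expand_4bit(nibbles, reference, m):
    """Rebuild 12-bit samples from 4-bit codes.

    Implements: 12-bit data = ((X - reference) * m) + Y,
    where Y is the previously reconstructed 12-bit value.
    Clamping to 0..0xFFF is an assumption, not stated in the text.
    """
    y = LEVEL_ZERO
    out = []
    for x in nibbles:
        y = (x - reference) * m + y
        y = max(0, min(0xFFF, y))   # keep the value inside 12 bits
        out.append(y)
    return out
```

For example, with a reference value of 7 and a multiplication factor of 4, the codes 8, 7, 6 and 14 correspond to changes of +4, 0, -4 and +28 relative to the previous sample.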

The data is extended to 12 bits because the voice is digitized
with 12 bits, though the invention is not particularly limited
to the above-mentioned algorithm. The multiplication factor (m)
is added to the data during the recording in order to maintain
accuracy at the time of restoration, and need not necessarily
be used. Next, a step 30 detects the voice data of an odd
number and checks whether the voice data is finished, at the
time of an odd number only. This is to decrement by 1 the flag
corresponding to the whole voice data and to finish the
procedure at the step 24 when the flag becomes 0.

Here, the parameters that are set for controlling the speed of
speech and the operations on the parameters are not limited to
the above-mentioned examples, and can be carried out by an
interrupt or by any other method. As for generating the
parameters, furthermore, their values may have been stored in
advance and a parameter may be read out when an input is
received from an external unit in order to set the speed of
speech according to the parameter, or an input fed from an
external unit may be read out and the speed of speech may be
controlled based upon the input. Thus, the constitution and
operation thereof can be suitably selected. Moreover, the
manner of processing need not be based only upon software but
may be based upon hardware as well.

The device constituted according to the above-mentioned
embodiment should preferably be of a portable size. When it is
constructed as a study book, the device will additionally have
a function of repeating the voice and a function of a
bookmarker. The size of the device is determined depending upon
the size of the recording medium. Therefore, the recording
medium should be small in size and should have a large capacity
as represented, for example, by a CD-ROM, a magneto-optical
mini-disc, a 3.5-inch floppy disk, a digital audio tape or the
like.

The digital voice may be obtained, without any limitation, by
subjecting synthesized voice or natural voice to A/D conversion
or compression by an existing system.

The constitution of the recording means according to a further
embodiment of the present invention will now be described with
reference to Fig. 9.

Reference numeral 91 denotes a voice input means which is
constituted by a microphone that forms analog voice electric
signals, a filter circuit, an amplifier circuit and the like,
and 92 denotes an A/D conversion means which is constituted by
a sampling circuit, an A/D converter circuit and the like and
may, as required, be further provided with circuits including a
compression function such as PCM, PWM or ATC like ADPCM.

Reference numeral 93 denotes a voiceless part detection means
which changes, deletes or moves data in the digital voice data
that would be confused with the preset code indicating the
voiceless state or with the parameter codes indicating
duration, etc., detects the voiceless part, makes the voiceless
part correspond to the code indicating the voiceless state and
to the parameter codes indicating duration, etc., and inserts
them into the voice data in an interrupting manner. Reference
numeral 94 denotes a voiceless part deletion means which
detects digital voice data by a number of voiceless section
bytes that follow the voiceless part control code.

Reference numeral 95 denotes a component control code
conversion means which, at the time of converting an analog
voice into a digital voice and at the time of restoring the
voice, records 4-bit data into the recording medium by using a
12-bit conversion means, the component control code being used
as a conversion parameter at the time of effecting the
conversion processing from 12 bits to 4 bits or vice versa. The
component control code conversion means 95 further works to
convert a portion of the digital voice signal having a large
change or a characteristic portion into a component control
code. A portion having a large change of the voice signal or a
characteristic portion will not be restored to a sufficient
degree unless the amount of digital data is increased. The
component control code conversion means 95 substitutes the data
of this portion by data representing quantity and number, i.e.,
by a component control code, in order to suppress the amount of
digital data, and works to obtain analog voice signals close to
the original waveforms during the reproduction even when the
voice data recorded in the recording medium has as few as about
4 bits.

Reference numeral 96 denotes a recombination means which, when
a control code such as a voiceless control code and voice data
are mixed together in a predetermined section, replaces the
voiceless control code by voice data that represents the
voiceless state and stores the control code at the head of the
next section. The voice data and the control code are mixed
together in the predetermined section when, for example, a
processing means such as the CPU has a processing ability of 8
bits, the voice data sequence is divided into units of 8 bits
so that it can be easily processed, and both voice data and
control data are contained in the same 8 bits.

In such a state, the code of the lower 4 bits is placed at the
head of the next predetermined section and, instead, voice data
that represents the voiceless state is substituted for that
place, in order to distinguish the code and the voice data from
each other. The recombination means also has a means for
encoding the digital voice with 4 bits.

Reference numeral 97 denotes a recording signal adjusting means
which converts an input signal into the form required for
writing it into the recording medium. This includes customarily
used means such as EFM modulation.

Reference numeral 97p denotes a pick-up for 
writing which is suitably selected depending upon 
the kind of the recording medium 98. The record- 
ing signal adjusting means 97 is constituted by one 
or a plurality of DSPs and a microcomputer, and is 
suitably selected depending upon the size of the 
device and the processing ability. 



The operation procedure of the voiceless part deletion means 94
of Fig. 9 will now be described with reference to flow charts
shown in Figs. 11 and 12. A mixed data sequence consisting of
voiceless signals and voice data is input by the voiceless part
detection means 93 to the voiceless part deletion means 94,
where voice data representing a voiceless state are deleted.

A step 3a reads the mixed data from the voiceless part
detection means 93 directly or via a buffer or the like.

A step 3b discriminates whether or not the first data is a
voiceless part detection control code. When it is a voiceless
part detection control code (yes), the program proceeds to a
next step 3d. When it is not (no), the voiceless part detection
control code and the number 0 of bytes of the voiceless section
are stored in the buffer memory, and the program proceeds to
the step 3d where it is judged whether the processing has
finished for all of the data. When the processing has not
finished (no), the mixed data sequence is read again.

Then, a step 3f discriminates whether or not the data is a
voiceless part detection control code. When it is the voiceless
part detection control code (yes), a step 3h stores only the
voiceless part detection control code and the number of bytes
representing the voiceless section, and the data are skipped.
Then, a step 3i advances the data pointer by the number of
bytes of the voiceless section corresponding to the voiceless
part detection control code, thereby deleting that number of
bytes of data. When the data at the step 3f is not the
voiceless part detection control code (no), the data that is
read is determined to be voice data and is stored in the
buffer. By repeating the above-mentioned processing, all of the
data are rearranged into a data sequence of voice data,
voiceless part detection control codes and codes of the byte
number of the voiceless section. After the data are
sequentially deleted or are deleted by a predetermined amount,
the output of the voiceless part deletion means 94 is input to
the component control code conversion means 95.
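
The deletion procedure of Fig. 11 can be sketched as below. The sketch assumes a simplified in-memory layout that the description does not specify: each voiceless section appears as a marker, followed by its byte count, followed by that many voiceless samples. The string marker `CTRL` is a hypothetical stand-in for the voiceless part detection control code, chosen so that it cannot collide with sample values.

```python
CTRL = "CTRL"  # hypothetical stand-in for the detection control code


def delete_voiceless(data):
    """Keep each control code and its byte count, drop the voiceless samples.

    If the sequence does not begin with a control code, a control code
    with a byte count of 0 is stored first, as in the step after 3b.
    """
    out = []
    if not data or data[0] != CTRL:
        out += [CTRL, 0]
    i = 0
    while i < len(data):
        if data[i] == CTRL:
            count = data[i + 1]
            out += [CTRL, count]   # store only the code and the count (3h)
            i += 2 + count         # advance the pointer past the section (3i)
        else:
            out.append(data[i])    # ordinary voice data goes to the buffer
            i += 1
    return out
```

The output is shorter than the input by exactly the total number of voiceless samples, which is where the saving in recording capacity comes from.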

Referring next to Fig. 12, a step 4a reads the data output from
the voiceless part deletion means 94. In the description of
this embodiment, 256 units of data are read and processed as
one unit.

A step 4b discriminates whether or not the data to be processed
is a voiceless part detection control code. When it is the
control code (yes), only the voiceless part detection control
code and the number of bytes of the voiceless section are
stored in the buffer, the data are skipped and the program
proceeds to a step 4d. When the data is not the voiceless part
detection control code, the program proceeds to the step 4d
where a difference is found between the previous data and the
present data, and a maximum value in the same mode is found
from the differential data. The step 4d discriminates whether
the mode is changed. When the mode is changed (yes), the
program proceeds to a step 4e. In the case of the same mode
(no), the program proceeds to the step 4a where the next data
is processed. The step 4e discriminates whether the change of
mode is to a plus/minus component mode. The plus/minus
component mode is, as shown in Figs. 13(A) to 13(C), for data
which changes little, has plus and minus components, and has a
difference smaller than a predetermined value relative to the
neighboring data.

In the step 4e, the answer (yes) is output when the data
corresponds to the plus/minus component mode and the answer
(no) is output when the data does not correspond to it. When
(yes) is output, a plus/minus control code corresponding to the
plus/minus component mode and a multiplication factor code are
stored in the voice code data. The multiplication factor code
compensates for a lack of bit expression in the amount of
change in the voice data and expresses the amount of change in
the voice data relying upon the multiplication factor. The
reference of the multiplication factor code is arbitrary and is
suitably selected depending upon the digital bit expression of
the voice data and the sample frequency.

When the sample frequency is 8 kHz and when the voice data and
control code digital data recorded in the recording medium have
4 bits, the multiplication factor is set as described below.
The directivity of plus or minus is determined with 7 as a
reference value. That is, the component has a minus sign for 0
to 6 and has a plus sign for 8 to 14 (Eh). In this case, a
maximum amount of change is 7. A multiplication factor per
change is found from (maximum difference (B), Fig. 13(A)) /
(maximum amount of change).

When the answer at the step 4e is (no), the program proceeds to
a step 4g which discriminates coincidence or non-coincidence of
differential data in the plus component mode. When the
differential data coincide with each other, the plus control
code corresponding to the plus component mode and the
multiplication factor are stored in the voice data. The
multiplication factor at this moment is set, for example, as
described below. The change is expressed by 1 to 14 (Eh) with 0
as a reference value, and a maximum amount of change is set to
14.

Next, a multiplication factor (maximum difference / maximum
amount of change) is calculated to find a multiplication factor
per change. At a step 4h, the plus control code and the
multiplication factor are stored in a portion of the voice data
detected in the plus component mode.

When the answer at the step 4g is (no), a step 4i stores the
minus control code and the multiplication factor in a
corresponding portion in the voice data. The minus component
does not require the data that indicates the directivity and is
expressed by 0 to 13 (Dh) with 14 as a reference value. A
maximum amount of change is 14, and a multiplication factor per
change (maximum difference / maximum amount of change) is
found.
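
The three component modes and their multiplication factors can be summarised in a small sketch. The reference values and maximum amounts of change are those stated above; the dictionary layout, the function name, and the choice to round the factor up to an integer of at least 1 are assumptions made for illustration, since the text only gives the ratio (maximum difference / maximum amount of change).

```python
import math

# Reference value and maximum amount of change per mode, as stated in the text.
MODES = {
    "plus_minus": {"reference": 7, "max_change": 7},    # codes 0..6 minus, 8..14 plus
    "plus": {"reference": 0, "max_change": 14},         # codes 1..14
    "minus": {"reference": 14, "max_change": 14},       # codes 0..13
}


def multiplication_factor(max_difference, mode):
    """Factor per unit of change = maximum difference / maximum change.

    Rounding up (assumed) ensures the largest difference in the section
    still fits into the available 4-bit code range.
    """
    ratio = max_difference / MODES[mode]["max_change"]
    return max(1, math.ceil(ratio))
```

With a maximum difference of 28 between neighbouring 12-bit samples, the plus/minus mode needs a factor of 4, whereas the one-sided plus mode needs only 2 because it can spend all fourteen steps in a single direction.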

The data in which the voice data and the component
multiplication factor control code are mixed are input to the
recombination means 96. The recombination means 96 encodes the
voice data in the mixed data from 12 bits into 4 bits and
recombines the voice data and the control code.

The operation of the recombination means 96 of Fig. 9 will now
be described with reference to a flow chart of Fig. 14.

A step 6a reads the data and processes the data by dividing
them into a predetermined number. A step 6b determines whether
the data is a control code. When the data is not a control code
(no), i.e., when the data is voice data, a step 6c executes the
encoding of 4 bits. In this case, the encoding of 4 bits may be
effected based upon the component and the data of the
multiplication factor. Described below is a method of
conversion from 12 bits to 4 bits.

The first time, the calculation is carried out with the
previous data (n-1) taken to be the level 0 (808h), and the
second and subsequent times, the data obtained through the
following calculation is used as the previous data. A
difference in the data is divided by a multiplication factor
(m) to obtain a value of change necessary for this difference.

4-bit voice data = (((n) - (n-1)) / m) + reference value

The component and the multiplication factor are determined by
the control code conversion means (95) described earlier, and
the component control code and the multiplication factor having
signs indicating they have not yet been rewritten are
effectively used until they are rewritten.
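
The conversion from 12 bits to 4 bits can be sketched as the inverse of the extension step. Several details here are assumptions rather than statements of the description: the function name, rounding the quotient to the nearest integer, limiting the code to 0..14 so that Fh stays free for control codes, and tracking the value a decoder would reconstruct as the "previous data" so that rounding errors do not accumulate.

```python
LEVEL_ZERO = 0x808  # previous data used for the first sample


def encode_4bit(samples, reference, m, lo=0, hi=14):
    """Encode 12-bit samples as 4-bit codes.

    Implements: code = ((n) - (n-1)) / m + reference,
    with the quotient rounded and the code limited to lo..hi (assumed).
    """
    prev = LEVEL_ZERO
    out = []
    for n in samples:
        code = round((n - prev) / m) + reference
        code = max(lo, min(hi, code))
        out.append(code)
        # Track what the decoder will rebuild from this code (assumption).
        prev = (code - reference) * m + prev
    return out
```

With a reference value of 7 and a multiplication factor of 4, the samples 2060, 2060, 2056 and 2084 are encoded as 8, 7, 6 and 14, i.e. changes of +1, 0, -1 and +7 steps.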

A step 6d determines whether the voice data encoded into 4 bits
are of an even number or an odd number. When the voice data are
of an even number (yes), the encoded data are stored in the
upper 4-bit register. After the data are stored in the upper
and lower bits, a step 6g stores in the buffer memory the
encoded data of a total of one byte composed of the upper 4
bits and the lower 4 bits. When the data is a control code at
the step 6b, a step 6h determines whether the voice data is the
end of data of an odd number. When it is the end of data of the
odd number (yes), data representing a voiceless state is stored
in place of the control code and, at the same time, the control
code is stored in the next upper 4 bits. A step 6j additionally
stores a component control code (kind, multiplication factor,
etc.) that accompanies the control code (Fh). When the voice
data does not end at an odd number at the step 6h (no), the
processing of the step 6j is carried out. The data in which the
voice data and the control code are rearranged are written into
a temporary or a main recording medium 98 via the recording
signal adjusting means 97 and the pick-up 97P for writing.
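
The packing of two 4-bit codes into one byte, together with the relocation of a control code that would otherwise land in the lower 4 bits, can be sketched as follows. The function name and the padding value are assumptions: the text only says that data representing a voiceless state is substituted, and the value 7 is used here because it means "no change" in the plus/minus mode. Codes that accompany a control code (kind, multiplication factor) are treated as ordinary nibbles in this simplified sketch.

```python
CONTROL = 0xF  # control code value (Fh) named in the text


def recombine(codes, pad=7):
    """Pack 4-bit codes into bytes, keeping control codes in the upper nibble.

    When a control code would fall into the lower 4 bits, a padding code
    (assumed: 7, i.e. no change) fills that slot and the control code is
    moved to the head of the next byte.
    """
    aligned = []
    for c in codes:
        if c == CONTROL and len(aligned) % 2 == 1:
            aligned.append(pad)
        aligned.append(c)
    if len(aligned) % 2:              # complete the final byte
        aligned.append(pad)
    return bytes((aligned[i] << 4) | aligned[i + 1]
                 for i in range(0, len(aligned), 2))
```

A reader of the packed stream can then recognise a control code by inspecting only the upper nibble of each byte, which suits the 8-bit processing means mentioned in the description.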

According to the present invention as described above in
detail, even a recording medium that is generally used can be
used for reproducing a voice reciting a book for a sufficiently
long period of time, while making it possible to freely change
the speed of speech during the reproduction and to output voice
which is little different from that of the ordinary recitation.

Through keen study, the present inventors have realized a voice
recording/reproducing device that is capable of varying the
speed of recitation, such as when the user wants to listen to
the recitation of a study book or an explanatory book at a slow
speed or when the user wants to listen to the recitation at a
fast speed. This is achieved by recording digital voice data in
a recording medium in a manner in which the voiceless parts are
substantially encoded, and by adding voiceless time at the time
of the reproduction. The voice data can thus be recorded in
sufficient amounts in the recording medium and, in addition,
voice output close to natural recitation can be obtained for
extended periods of time as a result of the addition of data
related to the original voiceless part during the reproduction
and the addition of voiceless part data of a desired time
duration.

Claims 

1. A voice reproducing device comprising a voice
signal recording means for recording a voice
signal of which a voiceless part is converted
into a predetermined voiceless part indication
data signal, and a voice reproducing means
which reproduces the recorded voice signal at
a desired speed of speech.

2. A voice recording/reproducing device comprising
a voiceless part indication conversion
means which converts a voiceless part included
in an input voice signal into a predetermined
voiceless part indication data signal,
and a recording means for recording an input
voice signal including the voiceless part indication
data signal converted by the voiceless part
indication conversion means.

3. A voice recording/reproducing device compris- 
ing a voiceless part indication conversion 
means which converts a voiceless part includ- 
ed in an input voice signal into a predeter- 
mined voiceless part indication data signal, a 
recording means for recording an input voice 
signal including the voiceless part indication 
data signal converted by the voiceless part 
indication conversion means, and a voice re- 
producing means for reproducing an input 
voice signal recorded in said recording means 
at a desired speed of speech. 

4. A voice recording/reproducing device according
to any one of claims 1 to 3, wherein
provision is further made of an adjusting
means for adjusting the speed of speech during
the reproduction of voice by said voice
reproducing means.

5. A voice recording/reproducing device accord- 
ing to any one of claims 1 to 4, wherein the 
voiceless part indication data signal in said 
voiceless part indication conversion means is a 
signal for deleting said voiceless part. 

6. A voice recording/reproducing device according
to any one of claims 1 to 4, wherein the
voiceless part indication data signal in said
voiceless part indication conversion means is a
signal representing data related to the time of
said voiceless part.

7. A voice recording/reproducing device according
to any one of claims 1 to 4, wherein the
voiceless part indication data signal in said
voiceless part indication conversion means is a
signal representing data related to a position in
said input voice signal in which said voiceless
part is disposed.

8. A voice recording/reproducing device accord- 
ing to any one of claims 1 to 4, wherein a 
plurality of said voiceless part indication data 
signals output from said voiceless part indica- 
tion conversion means are stored in a con- 
centrated manner at a predetermined position 
of said recording means. 

9. A voice recording/reproducing device according
to any one of claims 1 to 8, wherein, in
reproducing said input voice signals, said voice
reproducing means reproduces said voiceless
part indication data signals recorded in said
recording means while inserting them in predetermined
positions in a sequence of said
input voice signals according to predetermined
data that have been imparted to said voiceless
part indication data signals.






[Fig. 1: drawing not reproduced. Legible labels: 100, 1, 2, A/D]

[Fig. 2: drawing not reproduced. Legible labels: 200, 17, ADJUST]



[Fig. 3(A): waveform, drawing not reproduced. Legible labels: 31, 31]

[Fig. 3(B): waveform, drawing not reproduced. Legible labels: 32, 32]

[Fig. 4(A): waveform, drawing not reproduced]

[Fig. 4(B): waveform, drawing not reproduced. Legible labels: 42, 42]



[Fig. 5(A): waveform plot, drawing not reproduced. Axes: VOLTAGE
V (+)/(-) and TIME. Legible labels: WD, +Lth, -Lth, ts+2, tb, B, C, D]

[Fig. 5(B): drawing not reproduced]



[Fig. 7: flow chart, drawing not reproduced. Legible labels:
START; WAIT FOR 125 µSEC]



[Fig. 8(A): waveform, drawing not reproduced. Legible labels:
voiceless parts M1, M2, M3, ..., Mn; voice parts O1, O2, O3, O4,
..., On]

[Fig. 8(B): waveform, drawing not reproduced. Legible labels:
M1, M2, M3, ..., Mn grouped together; O1, O2, O3, ..., On]



[Fig. 9: block diagram, drawing not reproduced. Legible block
labels: 91; 92 A/D CONVERSION MEANS; 93 VOICELESS PART DETECTION
MEANS; VOICELESS PART DELETION MEANS; 95 COMPONENT CONTROL CODE
CONVERSION MEANS; 96 RECOMBINATION MEANS; 97; 97P]



[Fig. 10: flow chart, drawing not reproduced. Legible box texts:
START; READ DATA; PROCESS DATA HAVING SAME VALUE AS CONTROL DATA;
CALCULATE RANGE OF VOICELESS PART; VOICELESS PART START FLAG ON;
STORE VOICELESS STATE DETECTION CODE AND NUMBER OF BYTES OF
VOICELESS SECTION; VOICELESS PART START FLAG OFF; END. Legible
step labels: 2b, 2m, 2n]



Fig. 11 (flowchart not reproduced; legible steps, in printed order) 

STORE VOICELESS PART DETECTION CODE AND NUMBER OF BYTES OF VOICELESS SECTION THAT IS SET TO [value illegible] 
READ THE DATA 
END 
STORE ONLY VOICELESS PART DETECTION CONTROL CODE AND NUMBER OF BYTES OF VOICELESS SECTION 
STORE THE DATA THAT IS READ 
ADVANCE POINTER OF DATA BY THE NUMBER OF BYTES OF THE VOICELESS SECTION 

Legible step labels: 3c, 3e, 3f, 3g, 3h, 3i. Legible branch labels: Yes, No. 
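
The legible steps of Fig. 11 suggest that, for each marked voiceless section, only the detection control code and the byte count are stored, and the read pointer then skips the section itself. The following is a minimal sketch of that idea, not the circuit or program of the specification. The marker value `VOICELESS_CODE`, the one-byte count, the layout "code, count, then that many sample bytes", and the function name `delete_voiceless` are all assumptions made for illustration.

```python
VOICELESS_CODE = 0xF0  # hypothetical marker value, not taken from the figure


def delete_voiceless(marked: bytes) -> bytes:
    """Keep each voiceless marker and its byte count, drop the marked samples.

    Assumed input layout: VOICELESS_CODE, one count byte, then `count`
    sample bytes of the voiceless section. Ordinary samples are assumed
    never to equal VOICELESS_CODE.
    """
    out = bytearray()
    i = 0
    while i < len(marked):
        value = marked[i]
        if value == VOICELESS_CODE and i + 1 < len(marked):
            count = marked[i + 1]
            out += bytes([VOICELESS_CODE, count])  # store only code and count
            i += 2 + count  # advance the pointer past the voiceless section
        else:
            out.append(value)  # store the data that is read
            i += 1
    return bytes(out)


marked = bytes([0x10, 0x20, VOICELESS_CODE, 3, 0x80, 0x80, 0x80, 0x30])
compact = delete_voiceless(marked)
```

In this sketch the three silent bytes are dropped, while the count that remains in the stream still tells a player how long the pause was.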






Fig. 12 (flowchart not reproduced; legible steps, in printed order) 

START 
READ THE DATA 
CALCULATE COMPONENT MULTIPLICATION FACTOR 
STORE ONLY VOICELESS PART DETECTION CONTROL CODE AND NUMBER OF BYTES OF VOICELESS SECTION 
STORE PLUS/MINUS CONTROL CODE (F2h) AND MULTIPLICATION FACTOR 
STORE PLUS CONTROL CODE (FAh) AND MULTIPLICATION FACTOR 
STORE MINUS CONTROL CODE (F6h) AND MULTIPLICATION FACTOR 
END 






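Fig. 13(A), Fig. 13(B), Fig. 13(C) (drawings not reproduced; legible labels, in printed order) 

Fig. 13(A): MAX. VALUE 
Fig. 13(B): LARGER THAN A SPECIFIED VALUE (hexadecimal value illegible); MAX. VALUE 
Fig. 13(C): MAX. VALUE; LARGER THAN A SPECIFIED VALUE (hexadecimal value illegible) 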






Fig. 14 (flowchart not reproduced; legible steps, in printed order) 

READ THE DATA 
IS THE DATA A CONTROL CODE 
ENCODING OF 4 BITS BASED ON COMPONENT AND MULTIPLICATION FACTOR 
IS IT VOICE DATA OF AN ODD NUMBER 
TAKE ENCODED DATA INTO REFUGE IN THE UPPER 4 BITS 
IS IT THE END OF VOICE DATA OF AN ODD NUMBER 
STORE VOICELESS PART OUTPUT CONTROL CODE (Fh) 
STORE ONLY THOSE DATA THAT ACCOMPANY CONTROL CODE 
TAKE ENCODED DATA INTO REFUGE IN THE LOWER 4 BITS 
STORE VOICE DATA AS A BYTE 

Legible step labels: 6b, 6c, 6d, 6f, 6g, 6h, 6j, 6r. Legible branch labels: Yes, No. 
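
The legible steps of Fig. 14 indicate that each voice sample is encoded into 4 bits, that an odd-numbered code is held in the upper 4 bits, and that the following code fills the lower 4 bits before the pair is stored as one byte. A minimal sketch of that packing step follows. The function name `pack_nibbles` and the constant `PAD_NIBBLE` are hypothetical, and filling the lower 4 bits with Fh when the data end on an odd-numbered code is only a guess at the role of the figure's "(Fh)" control code.

```python
PAD_NIBBLE = 0xF  # assumed filler for a trailing odd-numbered code


def pack_nibbles(codes):
    """Pack 4-bit codes two per byte, odd-numbered code in the upper 4 bits."""
    out = bytearray()
    held = None
    for number, code in enumerate(codes, start=1):
        if not 0 <= code <= 0xF:
            raise ValueError("each code must fit in 4 bits")
        if number % 2 == 1:
            held = code << 4  # odd-numbered: hold in the upper 4 bits
        else:
            out.append(held | code)  # even-numbered: lower 4 bits, store byte
            held = None
    if held is not None:
        out.append(held | PAD_NIBBLE)  # data ended on an odd-numbered code
    return bytes(out)


packed = pack_nibbles([0x1, 0x2, 0x3])
```

Packing two codes per byte halves the storage needed compared with one code per byte, which is consistent with the stated aim of fitting a long recitation onto the recording medium.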









INTERNATIONAL SEARCH REPORT 

International application No. 
PCT/JP94/00661 

A. CLASSIFICATION OF SUBJECT MATTER 
Int. Cl5 G11B20/10 

According to International Patent Classification (IPC) or to both national classification and IPC 

B. FIELDS SEARCHED 

Minimum documentation searched (classification system followed by classification symbols) 
Int. Cl5 G11B20/10, G11B20/02 

Documentation searched other than minimum documentation to the extent that such documents are included in the fields searched 
Jitsuyo Shinan Koho 1965 - 1993 
Kokai Jitsuyo Shinan Koho 1971 - 1993 

Electronic data base consulted during the international search (name of data base and, where practicable, search terms used) 



C. DOCUMENTS CONSIDERED TO BE RELEVANT 

Category* (entries not legible) / Citation of document, with indication, where appropriate, of the relevant passages / Relevant to claim No. 

JP, A, 59-195307 (Casio Computer Co., Ltd.), 
November 6, 1984 (06. 11. 84), (Family: none) 
Relevant to claim No.: 1-3 

JP, A, 62-125577 (NEC Corp.), 
June 6, 1987 (06. 06. 87), (Family: none) 
Relevant to claim No.: 1-3 

JP, A, 60-35795 (Akai Electric Co., Ltd.), 
February 23, 1985 (23. 02. 85), (Family: none) 
Relevant to claim No.: 4-9 



[ ] Further documents are listed in the continuation of Box C. [ ] See patent family annex. 

* Special categories of cited documents: 

"A" document defining the general state of the art which is not considered 
to be of particular relevance 

"E" earlier document but published on or after the international filing date 

"L" document which may throw doubts on priority claim(s) or which is 
cited to establish the publication date of another citation or other 
special reason (as specified) 

"O" document referring to an oral disclosure, use, exhibition or other 
means 

"P" document published prior to the international filing date but later than 
the priority date claimed 

"T" later document published after the international filing date or priority 
date and not in conflict with the application but cited to understand 
the principle or theory underlying the invention 

"X" document of particular relevance; the claimed invention cannot be 
considered novel or cannot be considered to involve an inventive 
step when the document is taken alone 

"Y" document of particular relevance; the claimed invention cannot be 
considered to involve an inventive step when the document is 
combined with one or more other such documents, such combination 
being obvious to a person skilled in the art 

"&" document member of the same patent family 



Date of the actual completion of the international search 
July 18, 1994 (18. 07. 94) 

Date of mailing of the international search report 
August 9, 1994 (09. 08. 94) 

Name and mailing address of the ISA/ 
Japanese Patent Office 

Facsimile No. 

Authorized officer 
Telephone No. 

Form PCT/ISA/210 (second sheet) (July 1992) 



